Dictionary Look-Up within Small Edit Distance

نویسندگان

  • Abdullah N. Arslan
  • Ömer Egecioglu
چکیده

Let W be a dictionary consisting of n binary strings of length m each, represented as a trie. The usual d-query asks if there exists a string in W within Hamming distance d of a given binary query string q. We present an algorithm to determine if there is a member in W within edit distance d of a given query string q of length m. The method takes time O(dm d+1) in the RAM model, independent of n, and requires O(dm) additional space.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Efficient approximate dictionary look-up over small alphabets

Given a dictionary W consisting of n binary strings of length m each, a d-query asks if there exists a string in W within Hamming distance d of a given binary query string q. The problem was posed by Minsky and Papert in 1969 [10] as a challenge to data structure design. Efficient solutions have been developed only for the special case when d = 1 (the 1-query problem). We assume the standard RA...

متن کامل

Compressed String Dictionary Look-Up with Edit Distance One

In this paper we present different solutions for the problem of indexing a dictionary of strings in compressed space. Given a pattern P , the index has to report all the strings in the dictionary having edit distance at most one with P . Our first solution is able to solve queries in (almost optimal) O(|P |+ occ) time where occ is the number of strings in the dictionary having edit distance at ...

متن کامل

Lexical Access via Phoneme to Grapheme Conversion

The Lexical Access (LA) problem in Computer Science aims to match a phoneme sequence produced by the user to a correctly spelled word in a lexicon, with minimal human intervention and in a short amount of time. Lexical Access is useful in the case where the user knows the spoken form of a word but cannot guess its written form or where the users best guess is inappropriate for look-up in a stan...

متن کامل

Phrase-Based Statistical Machine Translation Using Approximate Matching

Phrase-based statistical models constitute one of the most competitive pattern-recognition approaches to machine translation. In this case, the source sentence is fragmented into phrases, then, each phrase is translated by using a stochastic dictionary. One shortcoming of this phrase-based model is that it does not have an adequate generalization capability. If a sequence of words has not been ...

متن کامل

Algorithme de recherche approximative dans un dictionnaire fondé sur une distance d'édition définie par blocs

We propose an algorithm for approximative dictionary lookup, where altered strings are matched against reference forms. The algorithm makes use of a divergence function between strings— broadly belonging to the family of edit distances; it finds dictionary entries whose distance to the search string is below a certain threshold. The divergence function is not the classical edit distance (DL dis...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002